This is an open access article published under an ACS AuthorChoice License, which permits 
copying and redistribution of the article or any adaptations for non-commercial purposes. 




pubs.acs.org/est 


Gridded National Inventory of U.S. Methane Emissions 


Joannes D. Maasakkers,*,† Daniel J. Jacob,† Melissa P. Sulprizio,† Alexander J. Turner,† Melissa Weitz,‡
Tom Wirth,‡ Cate Hight,‡ Mark DeFigueiredo,‡ Mausami Desai,‡ Rachel Schmeltz,‡ Leif Hockstad,‡
Anthony A. Bloom,§ Kevin W. Bowman,§ Seongeun Jeong,∥ and Marc L. Fischer∥


†School of Engineering and Applied Sciences, Harvard University, Pierce Hall, 29 Oxford Street, Cambridge, Massachusetts 02138, United States

‡Climate Change Division, Environmental Protection Agency, Washington, District of Columbia 20460, United States

§Jet Propulsion Laboratory, California Institute of Technology, Pasadena, California 91109, United States

∥Energy Technologies Area, Lawrence Berkeley National Laboratory, Berkeley, California 94720, United States


ABSTRACT: We present a gridded inventory of US 
anthropogenic methane emissions with 0.1° x 0.1° spatial 
resolution, monthly temporal resolution, and detailed scale- 
dependent error characterization. The inventory is designed to 
be consistent with the 2016 US Environmental Protection 
Agency (EPA) Inventory of US Greenhouse Gas Emissions 
and Sinks (GHGI) for 2012. The EPA inventory is available 
only as national totals for different source types. We use a wide 
range of databases at the state, county, local, and point source 
level to disaggregate the inventory and allocate the spatial and 
temporal distribution of emissions for individual source types. 
Results show large differences with the EDGAR v4.2 global 
gridded inventory commonly used as a priori estimate in 
inversions of atmospheric methane observations. We derive grid-dependent error statistics for individual source types from 
comparison with the Environmental Defense Fund (EDF) regional inventory for Northeast Texas. These error statistics are 
independently verified by comparison with the California Greenhouse Gas Emissions Measurement (CALGEM) grid-resolved 
emission inventory. Our gridded, time-resolved inventory provides an improved basis for inversion of atmospheric methane 
observations to estimate US methane emissions and interpret the results in terms of the underlying processes. 


■ INTRODUCTION


Under the United Nations Framework Convention on Climate 
Change (UNFCCC), individual countries must report their 
national anthropogenic greenhouse gas emissions calculated 
using comparable methods.’ The Intergovernmental Panel on 
Climate Change (IPCC)* provides three different methods or 
“tiers” for calculating emissions. All are bottom-up approaches 
in which emissions from individual source types are generally 
calculated as the product of activity data and emission factors. 
Increasing tiers are more detailed and require more country- 
specific data. In the United States, the Environmental 
Protection Agency (EPA) produces an annual Inventory of 
US Greenhouse Gas Emissions and Sinks (GHGI)* for 
reporting to the UNFCCC. The GHGI uses detailed 
information on activity data and emission factors, generally 
following IPCC Tier 2 and 3 methods. It provides detailed 
sectoral breakdown of emissions but only reports national totals 
for most source types. Here we present a spatially disaggregated 
version of the GHGI at 0.1° × 0.1° spatial resolution and
monthly temporal resolution, including detailed information 
and error characterization for individual emission types. Our 
goal is to enable the use of the GHGI as an a priori estimate for 
inversions of atmospheric methane that may guide improvements
in the inventory.

Table 1 gives the GHGI estimates for 2012 with method- 
ology updated in 2016° and including contributions from 
different source types. Total US anthropogenic emission is 29.0 
Tg a⁻¹, including major contributions from natural gas systems
(24%), enteric fermentation (23%), landfills (20%), coal mining 
(9%), manure management (9%), and petroleum (or 
equivalently oil) systems (8%). The inventory includes forest 
fire emissions but no other natural sources. The main natural 
source of methane is thought to be wetlands, accounting for 8.5
± 5 Tg a⁻¹ in the contiguous US (CONUS).* Annual
anthropogenic emissions from 1990 to 2014 computed by 
EPA’ with a consistent method (revised each year to include 
updated information) show no significant trend and little 
interannual variability, with US totals staying in the range 28.6–31.2
Tg a⁻¹ and contributions from individual source types
varying by only a few percent. 

Table 1. Inventories of US Anthropogenic Methane Emissions (Gg a⁻¹)ᵃ

source type                                  EPA GHGI (2012)        EDGAR v4.2 (2008)
agriculture
  enteric fermentation                       6670 (5936-7871)       6720
  manure management                          2548 (2089-3058)       2200
  rice cultivation                           476 (395-557)          418
  field burning of agricultural residues     11 (7-15)              38
natural gas systems                          6906 (5594-8978)       4758
  production                                 4442
  processing                                 890
  transmission and storage                   1116
  distribution                               487
waste
  landfills                                  5691 (3528-9333)       5230
    municipal                                5098
    industrial                               593
  wastewater treatment                       601 (367-613)          887
    domestic                                 368
    industrial                               232
  composting                                 77 (39-116)            83
coal mines
  coal mining                                2658 (2339-3057)       4140
    underground                              2159
    surface                                  499
  abandoned coal mines                       249 (204-309)
petroleum systems                            2335 (1775-5814)       1032
other
  forest fires                               443 (62-1214)          17
  stationary combustion                      265 (156-676)          424
  mobile combustion                          86 (76-101)            104
  petrochemical production                   3 (1-4)                24
  ferroalloy production                      1 (1-1)                1
total                                        29020 (26698-36565)    26075


ᵃColumn two shows the EPA inventory of US Greenhouse Gas 
Emissions and Sinks (GHGI) for 2012 as updated in 2016.° 95% 
confidence intervals are in parentheses as provided by EPA, sometimes 
only for broad source categories. Column three shows the US 
component of the global EDGAR v4.2 inventory for 2008.’ The 
gridded version of the EPA GHGI developed in this work includes 
separate files for all entries in this table. 
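
As a quick consistency check, the source shares quoted in the Introduction can be recomputed from the GHGI column of Table 1. The sketch below is illustrative only; the dictionary and function names are hypothetical and are not part of the GHGI or of the gridded product.

```python
# GHGI 2012 national totals copied from Table 1 (Gg CH4 per year).
GHGI_2012_GG = {
    "natural gas systems": 6906,
    "enteric fermentation": 6670,
    "landfills": 5691,
    "coal mining": 2658,
    "manure management": 2548,
    "petroleum systems": 2335,
}
NATIONAL_TOTAL_GG = 29020  # "total" row of Table 1


def share_percent(source_gg, total_gg=NATIONAL_TOTAL_GG):
    """Return a source's share of the national total, in percent."""
    return 100.0 * source_gg / total_gg


# Rounded shares, for comparison with the percentages quoted in the text.
shares = {name: round(share_percent(gg)) for name, gg in GHGI_2012_GG.items()}
```

The rounded shares agree with the percentages given in the text for the six largest source types.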


Application of atmospheric methane observations to estimate 
emissions usually involves inversion of an atmospheric 
transport model, with consideration of a priori information 
from an emission inventory to regularize the results and achieve 
a Bayesian optimal estimate of emissions.°° The inversion 
optimizes emissions on a grid, and the inventory used as a 
priori information must be available on that grid. In the absence 
of a gridded version of the GHGI, previous inverse studies for 
the US have relied on the global EDGAR inventory’ which 
provides annual emissions at 0.1° × 0.1° resolution. EDGAR
uses IPCC Tier 1 methods with international data sets, and 
only includes a limited breakdown by source type. National 
totals in EDGAR are generally consistent with EPA, as shown 
in Table 1, but we will see that there are large errors in spatial 
allocation that affect inverse analyses and their interpretation. 
Our gridded version of the GHGI not only provides a better a 
priori estimate but also a better basis for interpreting inversion
results and hence improving our understanding of the 
underlying processes. 


■ METHODS


We disaggregate the 2012 national emissions reported by the 
2016 version of the GHGI’ into a gridded 0.1° × 0.1° monthly
inventory. The gridded inventory is consistent with the EPA 
national emission totals for each source type (each entry in 
Table 1) and distributes these emissions based on information 
at the state, county, subcounty, and point source levels. In this 
manner, our inventory is a gridded representation of the 
national GHGI. Similar disaggregation has been done for 
national methane inventories in Switzerland,*” Australia, ° and 
the United Kingdom."' We limit our domain to the CONUS, 
which accounts for over 98% of total US emissions on the basis 
of our state-level estimates. We use the 2012 emissions from 
the 2014 EPA GHGI published in 2016, which includes 
detailed descriptions of the methods used to calculate the 
national emissions.’ The 2014 GHGI includes updates to the 
petroleum and natural gas emissions to reflect new studies.” 
We focus on the year 2012 as the latest year for which all spatial 
activity data are available. Updating our gridded inventory to 
newer iterations (the GHGI is updated annually) and later 
years will be straightforward as new activity data are released. 

We start from the most detailed spatial information directly 
available from the GHGI. This information varies by source 
type. Livestock emissions are available for each state, whereas 
waste and petroleum systems emissions are only available as 
national totals. Separate from the national inventory, EPA also 
collects methane emission and supporting data from large 
facilities under the Greenhouse Gas Reporting Program 
(GHGRP).’° Facilities with emissions greater than 25 Gg 
CO₂ equivalent a⁻¹ (corresponding to 0.11 tons h⁻¹ for a pure
methane source) and subject to the applicable regulatory 
requirements must report to the GHGRP. Some emissions 
reported to the GHGRP are directly measured (e.g., under- 
ground coal mines), while others are calculated on the basis of 
facility-level activity data (e.g., landfills). Where possible, we use 
facility-level emissions from the GHGRP but those sometimes 
need to be adjusted, as discussed below, to be consistent with 
the national inventory. 
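
The equivalence between the GHGRP reporting threshold and an hourly methane rate can be checked with a short calculation. This is a sketch under one stated assumption: a 100-year global warming potential of 25 for methane, which reproduces the quoted 0.11 tons h⁻¹ but is not given explicitly in the text. The function name is hypothetical.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 h in a non-leap year


def threshold_tons_ch4_per_hour(threshold_gg_co2e_per_year=25.0, gwp_ch4=25.0):
    """Convert a CO2-equivalent reporting threshold to tons of CH4 per hour.

    gwp_ch4 = 25 is an assumed 100-year global warming potential.
    """
    gg_ch4_per_year = threshold_gg_co2e_per_year / gwp_ch4
    tons_ch4_per_year = gg_ch4_per_year * 1000.0  # 1 Gg = 1000 metric tons
    return tons_ch4_per_year / HOURS_PER_YEAR


rate = threshold_tons_ch4_per_hour()
```

With these inputs the threshold corresponds to 1 Gg of methane per year, or roughly 0.11 tons per hour for a source emitting continuously.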

Agriculture. Emissions from agriculture include enteric 
fermentation, manure management, rice cultivation, and field 
burning of agricultural residues. EPA provides annual state-level 
enteric fermentation and manure management emissions for 
different animal types, taking into account varying practices 
across the country. We estimate county-level emissions by 
using livestock numbers for 14 different animal types (including 
different types of cattle) from the 2012 US Department of 
Agriculture Census of Agriculture.'* 
County-level emissions are allocated to the 0.1° x 0.1° grid 
using 9 different livestock occurrence probability maps (again 
distinguishing between different types of cattle) from USDA 
based on landtype.'* Emissions from enteric fermentation are 
assumed to have no intra-annual variability. Emissions from 
manure management vary with temperature as given by'® 


$$ f = \exp\left[ \frac{A\,(T_m - T_0)}{R\,T_0\,T_m} \right] \qquad (1) $$

where f is a monthly scaling factor, A = 64 kJ mol⁻¹ is the
activation energy, R is the ideal gas constant, T_m is the monthly
average surface skin (radiant) temperature, and T_0 = 303 K.
Monthly emissions are calculated by scaling annual emissions
with normalized monthly f-fields using 0.625° × 0.5° monthly
average surface skin temperature fields from the NASA
MERRA-2 meteorological data.'’ Livestock emissions also
vary subannually as a function of varying herd size and
management practices, but those effects are not included in
our inventory.
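
The temperature scaling of manure management emissions can be sketched as follows, reading eq 1 as the van't Hoff-Arrhenius form f = exp[A(T_m − T_0)/(R T_0 T_m)]. The monthly temperatures and the annual emission are hypothetical values for a single grid cell, the function names are ours, and the normalization weights all months equally (differences in month length are ignored for simplicity).

```python
import math

A_J_PER_MOL = 64_000.0  # activation energy A = 64 kJ mol^-1
R_J_PER_MOL_K = 8.314   # ideal gas constant
T0_K = 303.0            # reference temperature T_0


def scaling_factor(t_m_kelvin):
    """Eq 1 as read here: f = exp[A (T_m - T_0) / (R T_0 T_m)]."""
    return math.exp(
        A_J_PER_MOL * (t_m_kelvin - T0_K) / (R_J_PER_MOL_K * T0_K * t_m_kelvin)
    )


def monthly_emissions(annual_emission, monthly_temps_kelvin):
    """Split an annual emission over months in proportion to f."""
    f = [scaling_factor(t) for t in monthly_temps_kelvin]
    f_sum = sum(f)
    return [annual_emission * fi / f_sum for fi in f]


# Hypothetical monthly mean skin temperatures (K), January to December.
temps = [272, 275, 280, 286, 292, 297, 300, 299, 294, 287, 279, 273]
monthly = monthly_emissions(120.0, temps)
```

Because the monthly factors are normalized, the annual total is preserved and only its distribution over the year changes, with the largest share falling in the warmest month.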

Annual state-level emissions from rice cultivation are 
obtained from EPA and allocated to counties using acreage 
harvested from the USDA Census.'* Emissions for each county 
are allocated to the 0.1° × 0.1° grid based on crop maps with
30 m resolution from the USDA Cropland Data Layer 
product.'* Annual emissions are then distributed over 
individual months using normalized mean 2001-2010 
heterotrophic respiration rates from the 1° × 1° monthly
Carbon Data-Model Framework (CARDAMOM) terrestrial C 
cycle analysis. '” 
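
The rice allocation described above is a chain of proportional splits: state to county by harvested acreage, then annual to monthly by normalized respiration. A minimal sketch of that pattern follows; all input values are hypothetical, and the helper `allocate` is an illustrative name rather than part of any released code. The county-to-grid step using crop maps would follow the same pattern with per-cell crop area as weights.

```python
def allocate(total, weights):
    """Split `total` across the keys of `weights` in proportion to each weight."""
    weight_sum = sum(weights.values())
    if weight_sum <= 0:
        raise ValueError("weights must sum to a positive value")
    return {key: total * w / weight_sum for key, w in weights.items()}


# Hypothetical inputs: a state rice emission (Gg a^-1), county harvested
# acreage, and a monthly respiration climatology in arbitrary units.
state_emission = 100.0
county_acres = {"county_a": 30_000, "county_b": 10_000}
monthly_respiration = {
    1: 1, 2: 1, 3: 2, 4: 3, 5: 5, 6: 7,
    7: 8, 8: 7, 9: 5, 10: 3, 11: 2, 12: 1,
}

county_emissions = allocate(state_emission, county_acres)
county_a_monthly = allocate(county_emissions["county_a"], monthly_respiration)
```

Each split conserves the total passed to it, so the gridded monthly values still sum to the state-level emission.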

Emissions from field burning of agricultural residues of five 
individual crops (corn, rice, soybeans, sugar cane, and wheat) 
are allocated to a 2003-2007 monthly climatology of 
agricultural fires.”° 

Natural Gas Systems. This source type includes emissions 
from natural gas production, processing, transmission, and 
distribution. It does not include emissions from abandoned 
wells.’ Emissions from natural gas production are available 
from EPA for each of the six National Energy Modeling System 
(NEMS) regions defined by the US Energy Information 
Administration (EIA).~ The GHGI attributes emissions to 
different activities (e.g., vessel blowdowns, well workovers, 
liquid unloading) and equipment (e.g., pneumatic devices). 
Detailed maps of these activities and equipment are not 
available. Therefore, we rely on monthly well data obtained 
from DrillingInfo.”* Separate DrillingInfo data are available for 
the number of gas producing wells, nonassociated gas wells 
(gas-to-oil ratio over 100 mcf gas per barrel), coalbed methane 
wells, and coalbed methane well water production. We also 
distinguish conventional and unconventional wells as some 
emissions are specific to hydraulic fracturing. A well is flagged 
as unconventional if the drilling direction is horizontal as given 
by DrillingInfo or if the reservoir type is coalbed, low 
permeability, or shale.’ For each NEMS region, we allocate 
emissions using the DrillingInfo-based maps best representative 
of the spatial distribution of the considered activity or 
equipment. State-level condensate production from EIA” is 
combined with nonassociated gas well maps to allocate 
emissions from condensate tank vents. Three gas-producing 
states (Illinois, Indiana, and Tennessee) do not have active 
wells in the DrillingInfo database and amount to less than 1% 
of national active gas wells.”° For these states we use state-level 
data on the number of natural gas wells” to calculate state 
emissions and then use county-level gas production~® and 
finally three different well databases to grid emissions. ~~” For 
offshore emissions, the 2011 Gulfwide Offshore Activity Data 
System (GOADS) platform-level emission database is used for 
the Gulf of Mexico””° and DrillingInfo is used outside of the 
Gulf of Mexico, scaling total emissions to the national emission 
from the GHGI. As no national spatial data are available for 
gathering processes (only a subset report to the GHGRP”’), 
emissions from these processes are included in the production 
sector and gridded in the same way. 

Emissions from gas processing are only available as national 
totals in the GHGI. We allocate emissions to processing plants 
by combining the GHGRP data’’ with the EIA database for
these plants.” The GHGRP covers 85% of the processed gas 
flow from the EIA database. For the remaining plants in the 
EIA database, emissions are estimated by multiplying their gas 
flow with the average ratio of methane emissions to gas flow of 
the GHGRP plants. Subsequently, emissions from all plants are 
scaled to match the national GHGI number; the scaling is 
required because the GHGRP does not include all emitting 
processes occurring at the plants and has different emission 
estimates per process.’ Thus, we only use the GHGRP to 
allocate emissions in a relative sense with emission magnitudes 
constrained by the GHGI. The EIA database only provides 
postal codes for the processing plants and not coordinates; we 
determine non-GHGRP plant coordinates from the Rextag 
Strategies US Natural Gas Pipeline and Infrastructure Wall 
Map.” If there is no match with the GHGRP or Rextag data, 
emissions from the plant in the EIA database are spread out 
over the associated postal code area.** 
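
The gap-filling and rescaling steps for processing plants can be sketched as below. The plant records are hypothetical, and the "average ratio" is taken here as total GHGRP emissions divided by total GHGRP gas flow; this is an assumption, since the text does not say whether a flow-weighted or a plant-by-plant average is used. The national total of 890 Gg is the processing entry from Table 1.

```python
def allocate_processing_emissions(plants, national_total):
    """Gap-fill nonreporting plants, then scale all plants to the national total.

    plants: list of dicts with keys 'name', 'flow', and 'ghgrp_emission'
    (None when the plant does not report to the GHGRP).
    """
    reporting = [p for p in plants if p["ghgrp_emission"] is not None]
    # Assumed definition: flow-weighted mean emission per unit of gas flow.
    ratio = sum(p["ghgrp_emission"] for p in reporting) / sum(
        p["flow"] for p in reporting
    )
    raw = {}
    for p in plants:
        if p["ghgrp_emission"] is not None:
            raw[p["name"]] = p["ghgrp_emission"]
        else:
            raw[p["name"]] = ratio * p["flow"]
    # GHGRP values set only the relative distribution; the GHGI sets the magnitude.
    scale = national_total / sum(raw.values())
    return {name: emission * scale for name, emission in raw.items()}


# Hypothetical plants (flow and emissions in arbitrary consistent units).
plants = [
    {"name": "plant_a", "flow": 100.0, "ghgrp_emission": 2.0},
    {"name": "plant_b", "flow": 300.0, "ghgrp_emission": 2.0},
    {"name": "plant_c", "flow": 100.0, "ghgrp_emission": None},
]
gridded = allocate_processing_emissions(plants, national_total=890.0)
```

The same two-step pattern (estimate nonreporting facilities from an activity metric, then rescale to the GHGI total) applies to the other facility-based source types described in this section.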

EPA provides national emissions for different parts of the 
transmission sector. Most important are transmission com- 
pressor stations, for which we use a similar mapping as for 
processing plants. The GHGRP data for individual compressor 
stations are complemented with the EIA database for 
nonreporting compressor stations.’ Emissions for nonreport- 
ing compressor stations are estimated based on their 
throughput, * using the average ratio of throughput to methane 
emission from the GHGRP data. Emissions are then scaled to 
the national total so our results are not affected by potential 
underestimates in the GHGRP emissions.*° Similarly, a 
database of storage stations’ is combined with the GHGRP 
using total field capacity to predict emissions. Locations are 
based on the GHGRP, supplemented by gas storage field 
locations georeferenced from the Rextag Strategies US Natural 
Gas Pipeline and Infrastructure Wall Map,” and DrillingInfo.”* 
Similar approaches are also used for liquid natural gas (LNG) 
storage’ and LNG import terminals.*” Emissions from pipeline 
leaks and transmission meter and regulator stations are 
allocated to the network of interstate and intrastate pipelines.*° 
Emissions from farm taps are allocated to pipelines intersecting 
with agricultural land.'* Emissions related to storage at wells are 
mapped to all nonassociated gas wells.” 

Emissions from different parts of the distribution network are 
available from EPA as national estimates. State-level emissions 
from distribution pipeline leaks are calculated using state data 
on pipeline miles and services from the Pipeline and Hazardous 
Materials Safety Administration (PHMSA) of which the sum is 
used for the national GHGI.”’ This takes into account different 
materials (e.g., cast iron, plastic) with different emission factors. 
Emissions from distribution meter and regulator stations are 
divided among states using state-level aggregated GHGRP 
information (no finer spatial information is available). Other 
distribution emissions are partitioned between the states based 
on leaked gas volume data from EIA.” Within states, emissions 
are mapped to 0.1° × 0.1° population data from the 2010 US
Census.** 

Waste. Waste emissions include landfills, wastewater 
treatment, and composting, for which EPA provides national 
totals following the categories in Table 1. We allocate emissions 
from landfills based on a combination of data from the GHGRP 
(1231 municipal landfills, 175 industrial landfills), the Landfill 
Methane Outreach Program (LMOP, municipal landfills 
only),** and the Facility Registration Service (FRS).*° 
GHGRP landfills are assigned their reported emissions. 900 
of 2049 LMOP landfills do not report emissions to the 
GHGRP. For those we estimate emissions from GHGRP-reporting
landfills with similar attributes (presence of a
collection system, flares). Some landfills report landfill gas 
production through the LMOP. For the other landfills, waste in 
place is used as estimation metric combined with a decay factor 
for landfills that closed before 2012.*° For landfills without any 
data (108), we assign the median emissions from the landfills 
for which information was available. Finally, we use landfills 
with known coordinates from the FRS that are not present in 
the GHGRP or LMOP data sets. We decide whether a landfill 
is municipal or industrial based on keyword descriptors in the 
databases. The 722 municipal FRS landfills are assigned the 
median emission derived from above, after which all non- 
GHGRP emissions are scaled to match the national emission 
estimate. For industrial landfills, the national estimate minus 
the GHGRP emissions is uniformly allocated across the 2309 
industrial landfills from the FRS. 

Emissions from wastewater treatment are reported as 
municipal or industrial in the GHGI. Facilities that report to 
the GHGRP account for 84% of the national industrial 
wastewater treatment emissions. To allocate the remaining 
industrial emissions as well as the municipal emissions, we use 
facility-level wastewater flow data from the Clean Watersheds 
Needs Survey.*” Industrial wastewater treatment emissions are 
mapped to treated industrial flow, emissions from municipal 
septic systems to decentralized municipal flow, and centralized 
municipal systems emissions to centralized municipal flow. 

State-level emissions from composting are calculated using 
the tonnage of municipal solid waste composted or, if 
composting data are not available, from the correlated tonnage 
recycled.** Within states, emissions are allocated to locations 
from the US Composting Council,” BioCycle composter 
database,” and composting entries in the FRS.*° If there are 
fewer than three facilities found in a state, we allocate based on 
gridded population instead.** 

Coal Mines. We allocate coal mining emissions using state- 
level emission estimates produced for the GHGI (M. Coté, 
Ruby Canyon Engineering, unpublished data) for underground 
mines and surface mines. These estimates account for methane 
recovered or destroyed, as well as postmining emissions 
(methane released during coal handling and processing). We 
use the locations and production of all active surface and 
underground coal mines from EIA.’ A large number of 
underground mines report their annual methane emissions to 
GHGRP. We estimate emissions from nonreporting under- 
ground mines based on their share of the state total coal 
production combined with the state-level emissions, weighted 
by the basin-level in situ methane content of the coal for states 
that have mines in multiple basins.’ Subsequently, we scale the 
emissions from nonreporting mines so that the total national 
emissions (including the GHGRP mines) match the GHGI. 
For surface mines, no GHGRP data are available. Emissions are 
allocated using the EPA state-level data as given above, 
combined with EPA basin-level emission factors and EIA 
mine-level production data. Similarly, postmining emissions are 
allocated to all mines based on their production and basin- 
specific emission factors. 

The GHGI also includes emissions from abandoned coal 
mines. We start from the Abandoned Coal Mine Methane 
Opportunities Database (ACMMOD)® and add recently 
closed coal mines plus county-level estimates of mine closures 
before 1972 not included in ACMMOD (unpublished data 
produced for EPA by Ruby Canyon Engineering). For all
closed mines, estimates of closure dates, status (venting, sealed, 
flooded), and estimates of emissions when the mine was active 
are available or estimated from county-level averages, allowing 
the estimation of present-day emissions based on decline 
equations used in the GHGI.°’ ACMMOD only includes mine 
locations on the county level. Precise locations of approx- 
imately one-third of the abandoned mines are found in the Full 
Mine Info data set.°* The remaining emissions are allocated on 
the county level. 

Petroleum Systems. The GHGI includes national 
emissions from different activities and equipment related to 
petroleum production, refining, and transport. We use monthly 
well data for several production quantities from DrillingInfo~ 
to spatially allocate these emissions. These include total, heavy, 
and light oil production, with the cutoff between the last two at 
an American Petroleum Institute (API) gravity of 20. EPA 
estimates some emissions separately for heavy and light oil 
production. Furthermore, we created maps of oil wells (defined 
as wells with a produced gas-to-oil ratio under 100 mcf per 
barrel, wells with a higher ratio are classified as nonassociated 
gas wells), stripper wells (producing fewer than 10 barrels per 
day), and total and unconventional oil well completions. Similar 
to the allocation of natural gas production emissions, states 
without active wells in DrillingInfo are represented using state- 
specific data sets and amount to less than 1% of national 
production.*° For the other states, the national-level emissions 
from each activity and device are allocated using the 
DrillingInfo maps. For example, emissions from well drilling 
are mapped to well completions, while emissions from heavy 
crude oil wellheads are mapped to heavy oil wells. As for natural 
gas systems, offshore emissions are based on the GOADS 
database for the Gulf of Mexico and DrillingInfo elsewhere. 
Emissions from petroleum refining are allocated to GHGRP 
facilities based on their reported emissions. National emissions 
from petroleum transportation are divided between the wells, 
offshore platforms, and refineries. 

Other. Other refers to a number of smaller sources listed in 
Table 1. National forest fire emissions from the GHGI are 
distributed on a daily basis at 0.1° × 0.1° resolution using the
Quick Fire Emissions Data set (QFED v2.4) for 2012.°° 
Stationary combustion emissions from electricity generation are 
calculated by multiplying plant-level heat inputs from the Acid 
Rain Program with fuel type specific emission factors.°’ 
Additional stationary combustion emissions from the industrial, 
commercial, and residential sectors are based on state-level 
consumption of different types of fuel (coal, fuel oil, natural gas, 
and wood) as reported by EIA.°* Within states, residential and 
commercial emissions are allocated based on population while 
industrial emissions are allocated based on combustion 
emissions reported to the GHGRP. National on-road mobile 
combustion emissions for individual vehicle types in the GHGI 
are allocated spatially by first calculating state-level vehicle miles 
traveled (VMT) for six types of roads: urban and rural for each 
of primary, secondary, and other (minor) roads’ and 
attributing those to individual vehicle types.” These state 
totals are then mapped to the different road networks taken 
from the National Transportation Atlas°° and US Census 
products.°! Combustion emissions from rail transport are 
allocated over the US railroad network.°’ Emissions from 
agricultural equipment are uniformly spread out across all 
agricultural land.'* Emissions from mining-related vehicles are 
allocated to active mines.°* Construction and “other” mobile 
combustion emissions are mapped based on population. 
National emissions from ferroalloy production and from the
petrochemical industry are divided over the facilities reporting 
to the GHGRP based on their total reported emissions. 


■ RESULTS AND DISCUSSION


Figure 1. Contiguous US (CONUS) methane emissions from
different source categories. Total annual US emissions from the
2016 EPA GHGI for 2012 are disaggregated here on a 0.1° × 0.1° grid.
“Other” refers to the ensemble of minor sources in Table 1. (An
equivalent figure for EDGAR v4.2 is shown in Turner et al.)
Color scale: CH₄ emissions (Mg a⁻¹ km⁻²).


Figure 1 shows the distribution of annual emissions on the 0.1°
× 0.1° grid for the six general emission categories of the
Methods section. Emissions from agriculture are broadly
distributed across livestock farming areas. Hotspots are mostly 
from concentrated dairy cattle or hog populations such as in 
Iowa, North Carolina, and California. Rice cultivation 
contributes hotspots in northern California and along the 
lower Mississippi River. Emissions from natural gas systems are 
high in production fields, for example in Pennsylvania 
(Marcellus shale) and in Texas, with a maximum at Four 
Corners as found in top-down studies.°* Waste emissions 
(dominated by landfills; Table 1) roughly map to population, 
with hotspots from large landfills and wastewater facilities. Coal 
mining emissions are concentrated in Appalachia. Petroleum 
systems emissions peak over the Bakken region in North 
Dakota and western Texas where natural gas emissions are low. 
Other emissions mostly feature forest fire hotspots in the West 
and stationary combustion emissions in populated areas. 
Total CONUS emissions for 2012 are 28.7 Tg a⁻¹, slightly
lower than the 29.0 Tg a⁻¹ national total reported in Table 1
because of contributions from Alaska, Hawaii, and outside 
territories. Several sources vary monthly in our inventory 
including manure management, natural gas and petroleum 
production, stationary combustion, and forest fires (daily). 
Monthly emissions vary from 73 Gg per day in December to 89 
Gg per day in July. Most of this monthly variation arises from 
manure management, which varies nationally from 2.4 Gg per 
day in January to 16.8 Gg per day in July. Eq 1 is for liquid
storage systems but is applied here to all manure management 
systems, which may overestimate the seasonal variation.” 
For rice emissions, we assume a constant methane to CO₂
emission ratio from heterotrophic respiration, which may 
underestimate the seasonal variation as the ratio has been found 
to increase with temperature in wetlands and aquatic 
ecosystems.°° On the other hand, some seasonal factors are
not considered in our inventory because of a lack of data, including
seasonal changes in livestock numbers, feed, and gas/petroleum distribution.
Transient elevated emissions from oil/gas systems (the so- 
called “super-emitters””’) are also not resolved. 

Figure 2 compares the distribution of total methane 
emissions in our gridded EPA inventory for 2012 to the 
EDGAR v4.2 inventory for 2008, the latest year of full release.’ 
A fast track version of EDGAR (v4.2 FT2010’) has come out 
since but, based on visual inspection, the spatial emissions 
patterns in EDGAR v4.2 are of higher quality and most inverse 
studies have used EDGAR v4.2. There are large differences in 
spatial patterns between the gridded EPA inventory and
EDGAR v4.2, particularly for oil/gas systems and manure 
management. Emissions in the gridded EPA inventory are 
much higher over oil/gas production areas and lower over 
distribution (populated) areas. The two inventories show no 
significant correlation at their native 0.1° × 0.1° resolution (r =
0.06). The correlation increases to r = 0.42 at 0.5° × 0.5°
resolution and r = 0.63 at 1.0° × 1.0° resolution.
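
The dependence of the correlation on grid resolution can be illustrated with synthetic data: two fields that share a regional pattern but differ in their fine-scale placement correlate poorly at native resolution and better after aggregation. The sketch below uses random fields, not the inventories themselves, and treats all cells as equal in area, which real latitude-longitude cells are not. The function names are hypothetical.

```python
import numpy as np


def coarsen(grid, factor):
    """Sum emissions over non-overlapping factor x factor blocks of cells."""
    ny, nx = grid.shape
    if ny % factor or nx % factor:
        raise ValueError("grid dimensions must be divisible by factor")
    return grid.reshape(ny // factor, factor, nx // factor, factor).sum(axis=(1, 3))


def correlation(a, b):
    """Pearson correlation coefficient between two grids of equal shape."""
    return float(np.corrcoef(a.ravel(), b.ravel())[0, 1])


rng = np.random.default_rng(0)
# Shared regional pattern (10 x 10 blocks) plus independent fine-scale noise.
regional = np.kron(rng.random((10, 10)), np.ones((10, 10)))
inventory_a = regional + 2.0 * rng.random((100, 100))
inventory_b = regional + 2.0 * rng.random((100, 100))

r_native = correlation(inventory_a, inventory_b)
r_coarse = correlation(coarsen(inventory_a, 10), coarsen(inventory_b, 10))
```

In this synthetic case the coarse-grid correlation exceeds the native-grid value because the independent fine-scale differences average out within each block, while the shared regional pattern is retained.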

Previous inverse studies for US methane emissions using 
EDGAR as a priori estimate have all found the need for a large 
upward correction of emissions in the South-Central US.°”~”° 
Figure 3 shows the distributions of livestock, oil/gas systems, 
and waste emissions for that region in the gridded EPA and 
EDGAR v4.2 inventories. The EDGAR v4.2 inventory places 
the oil/gas emissions in urban areas and completely misses 
areas of production. The oil/gas emissions in EDGAR v4.2 are 
strongly correlated with waste emissions because both are 
largely distributed following population. An inversion using 
EDGAR v4.2 as a priori estimate would not be able to separate 
the two and might wrongly attribute a source in oil/gas 
production regions to livestock. This stresses the importance of 
using a high-quality a priori inventory in inverse analyses, both 
to regularize the solution and to enable interpretation of results. 
Whereas different source types show spatial correlation in the 
EDGAR v4.2 inventory because of mapping to common 
databases, there is no such correlation between source types in 
our gridded EPA inventory even at 1° × 1° resolution. This
separation between individual source types holds promise for 
interpreting results from inverse analyses. 

Error characterization is necessary for a gridded emission 
inventory to serve as a priori estimate in Bayesian inversions 
and to interpret results from the inversions. Error character- 
ization is not available for the EDGAR v4.2 inventory and 
inversions have typically assumed 30–100% uniform error
based on expert judgment, or used the inversion to estimate the 
error in the a priori.’’ The GHGI includes detailed error 
characterization on its national totals for individual source 
types, based on propagation of uncertainties in the construction 
of the bottom-up estimates (Table 1). Errors in our 0.1° x 0.1° 
gridded inventory may be larger because of local uncertainties 
in activity data and emission factors, including the precise 
localization of emissions. For the same reason, averaging our 
inventory over coarser grids (by adding contributions from 0.1°
× 0.1° grid cells) could reduce the error. This scale dependence
is important to describe because inversions may seek to 
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Figure 2. Total methane emissions in the gridded EPA inventory for 
2012 (top), EDGAR v4.2 for 2008 (middle), and difference between 
the two (bottom). 


optimize emissions at different spatial resolutions depending on 
the information content of the atmospheric observations. 
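The effect of averaging over coarser grids can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that errors in neighboring 0.1° × 0.1° cells are uncorrelated, so that error variances add when cells are summed. The emission values, the 80% relative error, and the function name `aggregate_error` are hypothetical and are not taken from the inventory.

```python
import math

def aggregate_error(emissions, rel_error):
    """Sum cell emissions and combine uncorrelated cell errors in quadrature."""
    sigmas = [rel_error * e for e in emissions]        # absolute error per cell
    total = sum(emissions)                             # emission of the coarse cell
    sigma_total = math.sqrt(sum(s * s for s in sigmas))
    return total, sigma_total, sigma_total / total

# Hypothetical case: 25 cells of 0.1 degrees (one 0.5 degree cell), equal emissions
cells = [2.0] * 25
total, sigma_total, rel = aggregate_error(cells, rel_error=0.8)
print(f"total={total:.1f}, sigma={sigma_total:.2f}, relative error={rel:.2f}")
```

For equal, independent cells the relative error falls by the square root of the number of cells summed, here from 0.80 to 0.16. Uneven emissions or correlated errors would reduce this benefit.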
Here we derive scale-dependent error statistics for our 
gridded EPA inventory by comparison to a detailed bottom-up 
emission inventory compiled by Environmental Defense Fund 
(EDF) for the ~300 × 300 km² Barnett Shale region in
Northeast Texas by Lyon et al. and subsequently updated
with top-down constraints by Zavala-Araiza et al. The EDF
inventory was constructed largely independently from the
GHGI. It is based on an extensive field campaign in the region
in September–October 2013 including measurements of
individual facilities as well as regional surveys. The Barnett
Shale region is of particular interest as a comparison standard
because it includes diverse sources: the largest oil/gas field in
the CONUS (30 000 active wells), major livestock operations,
and the metropolitan area of Dallas/Fort Worth. The EDF
inventory incorporates considerable local information that goes
beyond the databases used in constructing our inventory,
including for example precise locations of dairy farms, gas
gathering stations, and landfills. Emissions are reported on a 4
× 4 km² grid (approximately 0.04° × 0.04°) with detailed
breakdown by source types and statistical sampling of
“superemitter” facilities with anomalously large emissions.

Figure 4 shows emissions from livestock, natural gas, waste, 
and petroleum in the Zavala-Araiza EDF Barnett Shale 
inventory and compares to our gridded EPA inventory. 
Emission totals for the domain are shown in Table 2. There 
is a large difference in the magnitude of the source from oil/gas 
production, at least in part because Zavala-Araiza et al. find a 
larger frequency of superemitters than assumed in the GHGI 
emission factors. Despite this difference in magnitude there is a 
strong spatial correlation on the 0.1° × 0.1° grid (r = 0.78),
implying that correction to the gridded EPA distribution in an
inversion of atmospheric data could be reliably attributed to the
oil/gas production source type, smoothing temporally over
superemitters. The spatial correlation coefficient of the
livestock source between the gridded EPA and EDF inventories
is only 0.37 at 0.1° × 0.1° resolution but increases to 0.88 at
0.5° × 0.5° resolution. The gridded EPA inventory misses the
exact locations of farms but this error is smoothed out on the 
county scale. 
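
The resolution dependence of the livestock correlation can be illustrated with a minimal Python sketch. The two 10 × 10 grids below are hypothetical (they are not EPA or EDF data): the same four sources appear in both, but each is displaced by one 0.1° cell in the second grid. The helper names `pearson_r`, `coarsen`, and `flatten` are illustrative.

```python
import math

def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def coarsen(grid, factor):
    """Sum a 2-D grid (list of rows) over factor x factor blocks.

    The grid dimensions are assumed to be multiples of `factor`.
    """
    ny, nx = len(grid), len(grid[0])
    return [[sum(grid[j + dj][i + di]
                 for dj in range(factor) for di in range(factor))
             for i in range(0, nx, factor)]
            for j in range(0, ny, factor)]

def flatten(grid):
    """Turn a 2-D grid into a flat list of cell values."""
    return [value for row in grid for value in row]

# Hypothetical inventories: identical sources, shifted one cell east in inv_b
inv_a = [[0.0] * 10 for _ in range(10)]
inv_b = [[0.0] * 10 for _ in range(10)]
sources = {(1, 1): 5.0, (2, 7): 3.0, (6, 3): 4.0, (8, 8): 6.0}
for (j, i), emission in sources.items():
    inv_a[j][i] = emission
    inv_b[j][i + 1] = emission  # displaced, but still in the same 5 x 5 block

r_fine = pearson_r(flatten(inv_a), flatten(inv_b))
r_coarse = pearson_r(flatten(coarsen(inv_a, 5)), flatten(coarsen(inv_b, 5)))
print(f"r at 0.1 degrees: {r_fine:.2f}, r at 0.5 degrees: {r_coarse:.2f}")
```

In this toy case the correlation is slightly negative at the native resolution and equal to 1 after summing over 5 × 5 blocks, because every displaced source stays inside its coarse cell. Real inventories would show a more gradual improvement.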

We take the Zavala-Araiza EDF Barnett Shale inventory as 
our best approximation of emissions in the region in order to 
derive scale-dependent error statistics for different source types 
that can be used in an inversion of atmospheric concentration 
data. We assume for this purpose that the total error probability 
density function (pdf) for each source type in a given grid cell is 
Gaussian and includes a displacement error due to imprecise 
localization. Our error model is given by

$$\sigma(x) = \alpha \sum_{x'} E(x') \exp\left\{-\frac{(x - x')^2}{\beta^2}\right\} \tag{2}$$

Here, $\sigma(x)$ is the Gaussian error standard deviation for the grid
cell centered at location $x$ and for a given source type, $\alpha$ is a
base relative error standard deviation assuming no displacement
error, $E(x')$ is the 2-D field of emissions for that source type
over all grid cells, and $\beta$ is a length scale for the displacement
error. $\alpha$ and $\beta$ are assumed to be uniform for a given source
type. We find optimal values for $\alpha$ and $\beta$ by minimizing a
least-squares cost function $J(\alpha, \beta)$ for the difference between our
estimated error standard deviation and the absolute difference
between the gridded EPA and EDF emissions:

$$J(\alpha, \beta) = \sum_{x} \left(\sigma(\alpha, \beta, x) - \left|E(x) - E_{\mathrm{EDF}}(x)\right|\right)^2 \tag{3}$$



where the summation is over all grid cells of the Barnett Shale 
domain in Figure 4. Optimization of $\alpha$ and $\beta$ is done for the
different source types of Figure 4 (also separating waste as 
landfills and wastewater) and for grid resolutions L from 0.1° to 
0.5° to determine the scale dependence of the error. 0.5° is the 
coarsest scale that can be usefully constrained from the Barnett 
Shale inventory, but from there we can extrapolate to the 

Figure 3. Emissions from livestock, oil/gas systems, and waste over the South-Central US in the gridded EPA inventory for 2012 and the EDGAR
v4.2 inventory for 2008.

Figure 4. Methane emissions in the Barnett Shale region of Northeast Texas. Values for the four main categories are shown for our gridded EPA
inventory and for the EDF inventory at 0.1° × 0.1° resolution. The original EDF inventory is at 4 × 4 km² and is regridded here to 0.1° × 0.1° for
comparison with our inventory. The location of the Barnett Shale region is shown inset. Emission totals for the region are given in Table 2.


national scale using the GHGI error estimates. For this purpose
we take the average of the upper and lower confidence intervals
for the given source type in Table 1 as representing the relative
error standard deviation $\alpha_N$ on the national scale. We then fit
our results for $\alpha(L)$ and $\beta(L)$ to exponential forms of $L$, with
asymptote $\alpha_N$ for $\alpha$. This yields


$$\alpha = \alpha_0 \exp(-k_\alpha (L - L_0)) + \alpha_N \tag{4}$$

$$\beta = \beta_0 \exp(-k_\beta (L - L_0)) \tag{5}$$


Here $L_0$ = 0.1° is the native resolution of our inventory, and $k_\alpha$
and $k_\beta$ (in units of inverse degrees) are smoothing coefficients
that express the scale dependence of the error. The fit is subject
to the condition $\alpha_0 \geq 0$; if the base error standard deviation
derived from the Barnett Shale inventory is smaller than $\alpha_N$
then we assume that $\alpha$ is scale-independent and equal to $\alpha_N$.
Figure 5 shows the base relative error standard deviation $\alpha$
and displacement length scale $\beta$ as a function of grid resolution
$L$ for the different source types active in the Barnett Shale.
Values for all coefficients in eqs 4 and 5 are given in Table 3.
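
The optimization of the two error parameters against a reference inventory (eq 3) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure actually used: the exponent of the displacement kernel is taken as the squared distance divided by the squared length scale, distances are Euclidean in degrees, the minimization is a brute-force search over a small candidate list, and the 4 × 4 grids are synthetic. The function names are hypothetical.

```python
import math

def sigma_field(E, alpha, beta, dx=0.1):
    """Error standard deviation per cell for emission grid E (list of rows).

    Assumed form: sigma(x) = alpha * sum over x' of E(x') * exp(-|x - x'|^2 / beta^2),
    with a cell spacing of dx degrees and beta in degrees.
    """
    ny, nx = len(E), len(E[0])
    sigma = [[0.0] * nx for _ in range(ny)]
    for j in range(ny):
        for i in range(nx):
            total = 0.0
            for jj in range(ny):
                for ii in range(nx):
                    d2 = ((j - jj) ** 2 + (i - ii) ** 2) * dx ** 2
                    total += E[jj][ii] * math.exp(-d2 / beta ** 2)
            sigma[j][i] = alpha * total
    return sigma

def cost(E, E_ref, alpha, beta, dx=0.1):
    """Least-squares cost J(alpha, beta) summed over all grid cells."""
    sigma = sigma_field(E, alpha, beta, dx)
    return sum(
        (sigma[j][i] - abs(E[j][i] - E_ref[j][i])) ** 2
        for j in range(len(E))
        for i in range(len(E[0]))
    )

def fit_alpha_beta(E, E_ref, alphas, betas, dx=0.1):
    """Brute-force search for the (alpha, beta) pair that minimizes J."""
    best = min((cost(E, E_ref, a, b, dx), a, b) for a in alphas for b in betas)
    return best[1], best[2]

# Synthetic 4 x 4 emission grid (arbitrary units), for illustration only
E = [[0.0, 1.0, 0.0, 0.0],
     [2.0, 8.0, 1.0, 0.0],
     [0.0, 3.0, 5.0, 0.0],
     [0.0, 0.0, 1.0, 4.0]]
# Synthetic reference whose differences follow alpha = 0.6, beta = 0.1 degrees
true_sigma = sigma_field(E, 0.6, 0.1)
E_ref = [[E[j][i] + true_sigma[j][i] for i in range(4)] for j in range(4)]

alpha_fit, beta_fit = fit_alpha_beta(
    E, E_ref, alphas=[0.2, 0.4, 0.6, 0.8, 1.0], betas=[0.05, 0.1, 0.2]
)
print(alpha_fit, beta_fit)
```

Because the synthetic reference is built from known parameters, the search recovers them. With real inventories the cost surface would be noisier and a continuous optimizer would be a natural replacement for the candidate lists.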
Base error standard deviations ($\alpha$) for different source types at

Table 2. Regional Methane Emissions (Gg a⁻¹)ᵃ

                    Barnett Shale region                              California
source              EDF (Lyon)  EDF (Zavala-Araiza)  this work  r     CALGEM  this work  r
oil/gas production  330         436                  327        0.78  171     264        0.90
gas processing      49          65                   62         0.24  12      7          0.25
gas transmission    16          2                    8          0.20  22      24         0.69
gas distribution    10          9                    16         0.87  131     39         0.98
livestock           104         102                  122        0.37  721     885        0.46
landfills           105         99                   92         0.76  316     507        0.86
wastewater          7           7                    12         0.21  91      45         0.53
sum                 621         720                  640        0.68  1463    1772       0.66

ᵃAnthropogenic emissions from the Barnett Shale region in Northeast Texas (Figure 4) and from the state of California (Figure 6). Regional totals
by source type from our gridded version of the EPA inventory for 2012 (this work) are compared to the original bottom-up (Lyon) EDF
inventory for the Barnett Shale in October 2013, the updated (Zavala-Araiza) EDF inventory including top-down information, and the CALGEM
inventory for California in 2008 (livestock/waste) and 2010 (oil/gas). Also shown are spatial correlation coefficients r on the 0.1° × 0.1° grid
for the Barnett Shale and 0.2° × 0.2° for California.

Figure 5. Relative error standard deviations for methane emissions from individual source types and their scale dependences. The figure shows the
error parameters $\alpha$ and $\beta$ used in eq 2 to calculate the absolute error standard deviations for a given $L$ × $L$ grid cell and source type as a function of
the grid resolution $L$. The native grid resolution of the inventory is 0.1° × 0.1°, and averaging over coarser scales decreases errors for individual
source types as described by exponential decay functions (eqs 4 and 5). The asymptotes for the base error standard deviations are the national values
$\alpha_N$ shown as tick marks on the right side of the left panel. Values for all error parameters are given in Table 3.


0.1° × 0.1° grid resolution are all above 50%. Errors for
livestock, natural gas systems, and wastewater are scale-dependent
and decrease when coarser grid resolutions are
used. Errors for petroleum systems and landfills are defined by
the national estimates, which are relatively large, and are thus
scale-independent. The displacement error measured by $\beta$ is
usually very small, less than 0.1°, in part because it is isotropic
(there is no a priori information on the direction of
displacement error). Because of its Gaussian form, it
emphasizes the effect of neighboring misplacements; it would
not capture the error from a distant misplacement or from a
completely missing source.

We recommend that users of our emission inventory at a
given grid resolution $L$ apply the error parameters in Table 3
nationally to derive $\alpha(L)$ and $\beta(L)$ from eqs 4 and 5, and from
there use eq 2 to derive the absolute error standard deviation $\sigma$
for individual source types and grid cells. For source types not
constrained by the Barnett Shale inventory, we assume here
that the base error standard deviation at 0.1° × 0.1° resolution
is 2.5 times the national value from Table 1, based on the
median scale dependence for the sources in the Barnett Shale.
We use median values of the other error parameters in Table 3
and cap $\alpha$ at 1.0. Error standard deviations for the different source types
present in a grid cell can be added in quadrature to derive the
error standard deviation for the total emission in that grid cell. A simple
variogram analysis of the difference between the EDF and
EPA inventories shows no spatial error correlation, either for
total emissions or for individual source types, suggesting that
the a priori error covariance matrix needed for a Bayesian
inversion can be assumed diagonal. A previous study comparing
a disaggregated national inventory for Switzerland to EDGAR
v4.2 did find significant spatial error correlations.

Table 3. Error Parameters for the Gridded EPA Emission Inventoryᵃ

rem mn a aD by re
ee rans 7 ces :
diatinall gasseystenas 028 Ag 035 009 39
landfills F ae dis Ja
eee erry ner ae 0.78 14 021 0.06 69
petroleand aystens 0 0.87 0.04 197

ᵃError parameters for use in eqs 4 and 5 to compute the base relative
error standard deviation $\alpha(L)$ and displacement error length scale
$\beta(L)$ for different source types at different grid resolutions $L$ × $L$. The
resulting values of $\alpha(L)$ and $\beta(L)$ should be used in eq 2 to estimate
the error standard deviation for a given source type and grid cell. Units
are degrees for $L$ and $\beta_0$, and inverse degrees for $k_\alpha$ and $k_\beta$. $\alpha_0$ and $\alpha_N$
are dimensionless. The livestock error estimate is to be applied to the
sum of enteric fermentation and manure management emissions.
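
The recommended workflow (eqs 4 and 5 for the scale-dependent parameters, eq 2 for the error standard deviation of each source type, then combination across source types in quadrature) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the displacement kernel is taken as the exponential of minus the squared distance over the squared length scale, and every parameter value, emission value, and name (`source_a`, `source_b`, `alpha_of_L`, and so on) is a placeholder rather than an entry from Table 3.

```python
import math

L0 = 0.1  # native grid resolution in degrees

def alpha_of_L(L, alpha0, k_alpha, alpha_N):
    """Eq 4: base relative error standard deviation at resolution L (degrees)."""
    return alpha0 * math.exp(-k_alpha * (L - L0)) + alpha_N

def beta_of_L(L, beta0, k_beta):
    """Eq 5: displacement length scale (degrees) at resolution L (degrees)."""
    return beta0 * math.exp(-k_beta * (L - L0))

def sigma_field(E, alpha, beta, L):
    """Eq 2 on an L x L grid, assuming exp(-|x - x'|^2 / beta^2) weights."""
    ny, nx = len(E), len(E[0])
    sigma = [[0.0] * nx for _ in range(ny)]
    for j in range(ny):
        for i in range(nx):
            total = 0.0
            for jj in range(ny):
                for ii in range(nx):
                    d2 = ((j - jj) ** 2 + (i - ii) ** 2) * L ** 2
                    total += E[jj][ii] * math.exp(-d2 / beta ** 2)
            sigma[j][i] = alpha * total
    return sigma

def total_sigma(sigma_fields):
    """Combine per-source error standard deviations in quadrature, cell by cell."""
    ny, nx = len(sigma_fields[0]), len(sigma_fields[0][0])
    return [[math.sqrt(sum(s[j][i] ** 2 for s in sigma_fields))
             for i in range(nx)]
            for j in range(ny)]

# Placeholder parameters and emissions, NOT the Table 3 values
params = {
    "source_a": dict(alpha0=0.5, k_alpha=3.0, alpha_N=0.2, beta0=0.08, k_beta=2.0),
    "source_b": dict(alpha0=0.0, k_alpha=0.0, alpha_N=0.9, beta0=0.04, k_beta=2.0),
}
emissions = {
    "source_a": [[1.0, 4.0], [0.0, 2.0]],
    "source_b": [[3.0, 0.0], [0.0, 1.0]],
}

L = 0.2  # target grid resolution in degrees
fields = []
for name, p in params.items():
    a = alpha_of_L(L, p["alpha0"], p["k_alpha"], p["alpha_N"])
    b = beta_of_L(L, p["beta0"], p["k_beta"])
    fields.append(sigma_field(emissions[name], a, b, L))
sigma_total = total_sigma(fields)
print(sigma_total)
```

Adding the per-source contributions in quadrature relies on the errors of different source types being independent, and treating each grid cell separately relies on the diagonal error covariance discussed above.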

Our error model is a first attempt to quantify grid-dependent
errors for use in inversions of atmospheric concentration data,
and in that it significantly improves on previous bottom-up
inventories. It has, however, a number of weaknesses. First, the
Barnett Shale region may not be representative nationally and
offers no error characterization for some sources (in particular
coal mining). Second, inverse analyses of atmospheric
observations suggest that EPA underestimates on the
national scale may be larger than estimated from $\alpha_N$,
although these inverse analyses have their own errors. Third,
the assumed Gaussian form for the error pdf is convenient for
analytical inversions but is not optimal. It does not exclude
unphysical negative solutions and it does not capture the “fat
tail” of the pdf contributed by superemitters. A log-normal
error pdf would solve the positivity problem and allow a better
description of the fat tail. Fourth, some spatial error correlation
would be expected even though it cannot be detected in our
simple variogram analysis for the Barnett Shale; more advanced
variogram analyses and better data sets might enable
detection. Fifth, we do not consider temporal error
correlations because inversions typically focus on optimizing
the spatial distribution while assuming the temporal variation to
be known (and relatively weak in our case). This may not be
appropriate for some applications, in particular when
optimizing emissions from seasonally varying sources.
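
The positivity problem of the Gaussian form can be made concrete with a short Python sketch. For a Gaussian error pdf centered on the inventory value with relative standard deviation α, the probability assigned to negative emissions is Φ(−1/α), where Φ is the standard normal cumulative distribution. The log-normal parameters below come from standard moment matching (same mean and same relative standard deviation); that choice and the function names are assumptions for illustration and are not prescribed in the text.

```python
import math

def gaussian_negative_probability(rel_error):
    """Probability mass below zero for a Gaussian with mean E and sigma = rel_error * E."""
    return 0.5 * math.erfc(1.0 / (rel_error * math.sqrt(2.0)))

def lognormal_params(mean, rel_error):
    """Log-space mean and standard deviation matching a given mean and relative error."""
    s2 = math.log(1.0 + rel_error ** 2)
    mu = math.log(mean) - 0.5 * s2
    return mu, math.sqrt(s2)

for rel_error in (0.5, 1.0):
    p_neg = gaussian_negative_probability(rel_error)
    print(f"relative error {rel_error:.1f}: P(emission < 0) = {p_neg:.3f}")

# A log-normal pdf with the same mean and relative error has no mass below zero
mu, s = lognormal_params(mean=10.0, rel_error=0.5)
print(f"log-normal parameters: mu={mu:.3f}, sigma={s:.3f}")
```

With relative errors of 50% or more, the Gaussian form places at least about 2% of the probability on negative emissions, and about 16% when the relative error reaches 100%.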

The state of California has developed its own methane
emission inventory in support of its policy objective to reduce
greenhouse gas emissions to 1990 levels by 2020. Similarly to
our work here, this California inventory has been disaggregated
by Zhao et al., Jeong et al., and Jeong et al. to produce the
gridded 0.1° × 0.1° California Greenhouse Gas Emissions
Measurement (CALGEM) inventory (calgem.lbl.gov/emissions).
The CALGEM grid is offset by 0.05° from ours,
so we can only compare them at 0.2° × 0.2° and even then with
some unresolvable remapping error. Table 2 compares total
California emissions for individual source types, including
spatial correlation coefficients. Total state emissions are close
(1772 Gg a⁻¹ in our work and 1463 Gg a⁻¹ in CALGEM).
There is more difference in individual source types but most 
source types show strong correlations between the two 
inventories, suggesting that they could be effectively con- 
strained in an inverse analysis of atmospheric observations. 
Figure 6 compares the spatial distributions of emissions in the 
two inventories, including our (Barnett-based) estimated error 
standard deviation on the 0.2° × 0.2° grid, and adding error
from different source types in quadrature for a given grid cell. 
We find that 51% of CALGEM emissions are from cells that 
have emission magnitudes within one standard deviation of our 
gridded EPA emissions. The largest differences are from 
livestock emissions, as CALGEM uses more local data to 
distribute these emissions within the large California counties. 
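
The agreement statistic quoted above (the share of CALGEM emissions located in cells that agree with our inventory to within one error standard deviation) can be computed as in the following Python sketch. The three-cell arrays are hypothetical and the function name `fraction_within_one_sigma` is illustrative.

```python
def fraction_within_one_sigma(ours, sigma, other):
    """Fraction of the `other` inventory's total emissions found in cells
    where it differs from `ours` by no more than one error standard deviation."""
    total = sum(other)
    inside = sum(o for e, s, o in zip(ours, sigma, other) if abs(o - e) <= s)
    return inside / total

# Hypothetical flattened grids (same cell order in all three lists)
ours = [10.0, 5.0, 2.0]    # our gridded emissions
sigma = [3.0, 1.0, 1.0]    # our error standard deviation per cell
other = [12.0, 8.0, 2.5]   # comparison inventory

fraction = fraction_within_one_sigma(ours, sigma, other)
print(f"{100 * fraction:.0f}% of comparison emissions are within one sigma")
```

The statistic is weighted by emission magnitude in the comparison inventory, so a few large cells that disagree can lower it substantially even when most cells agree.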

In summary, we have constructed a gridded version of the
EPA GHGI published in 2016 for US anthropogenic methane
emissions with monthly 0.1° × 0.1° resolution including
detailed information on different source types. Our inventory
includes error characterization for different source types and
spatial scales, as required for application as a priori estimate in
inverse analyses. Our inventory is for 2012 emissions but can
easily be updated to later years as activity data become available.
Monthly gridded emission fields for all emission subcategories
in Table 1 are publicly available at
www.epa.gov/ghgemissions/gridded-2012-methane-emissions.

Figure 6. Methane emissions in California in 2012 from all sources in
Table 2. The top panels show results from our gridded EPA inventory
and from the CALGEM inventory on their native 0.1° × 0.1° grids.
The bottom left panel shows the difference and the bottom right panel
shows the error standard deviation in our gridded EPA inventory as
computed with the method described in the text. The sign of the error
standard deviation in the figure is the same as the EPA-CALGEM
difference to facilitate visual comparison. Differences and error
standard deviations are shown on a 0.2° × 0.2° grid to account for
the 0.05° offset between the EPA and CALGEM grids.